Trade Off between Compression and Search Times in Compact Suffix Array

Author

  • Veli Mäkinen
Abstract

The suffix array is a widely used full-text index that allows fast searches on the text. It is constructed by sorting all suffixes of the text in lexicographic order and storing pointers to the suffixes in this order. Binary search is used for fast searches on the suffix array. The compact suffix array is a compressed form of the suffix array that still allows binary searches, but the search times also depend on the compression. In this paper, we answer some open questions concerning the compact suffix array, study practical issues such as the trade-off between compression and search times, and show how to reduce the space requirement of the construction. Experimental results are provided in comparison with other search methods. The results show that the size of a compact suffix array is usually less than twice the size of the text, while the search times remain comparable to those of suffix arrays.
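The construction and search described in the abstract can be illustrated with a minimal sketch of a plain (uncompressed) suffix array. This is not the paper's compact suffix array or its construction algorithm; the function names `build_suffix_array` and `find_occurrences` are hypothetical, and the naive sort is chosen for clarity rather than efficiency.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes, sorted lexicographically.

    Naive construction (an assumption for illustration): it compares
    whole suffixes, so it is only suitable for small inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return all start positions of `pattern` in `text`, in increasing order.

    Two binary searches locate the contiguous range of suffixes whose
    first len(pattern) characters equal the pattern.
    """
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(sa)                                 # suffix start positions in sorted order
print(find_occurrences(text, sa, "ana"))  # positions where "ana" occurs
```

Each binary search performs a logarithmic number of comparisons in the text length, and each comparison inspects at most as many characters as the pattern has. A compressed variant such as the compact suffix array keeps the binary search but has to reconstruct array entries on demand, which is where the trade-off between compression and search time arises.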

Similar resources

Indexing Compressed Text

We present a technique to build an index based on suffix arrays for compressed texts. We also propose a compression scheme for textual databases based on words that generates a compression code that preserves the lexicographical ordering of the text words. As a consequence it permits the sorting of the compressed strings to generate the suffix array without decompressing. As the compressed text is ...

Compact Suffix Array

The suffix array is a data structure that can be used to index a large text file so that queries of its content can be answered quickly. Basically, a suffix array is an array of all suffixes of the text in lexicographic order. Whether or not a word occurs in the text can be answered in logarithmic time by binary search over the suffix array. In this work we present a method to compress a suffix array such ...

Suffix Binary Search Trees and Suffix Arrays

Suffix arrays and suffix binary search trees are two data structures that have been proposed as alternatives to the classical suffix tree to facilitate efficient on-line string searching. Here, we explore the relationship between these two structures. In particular, we present an alternative view of a suffix array, with its auxiliary information, as a perfectly balanced suffix binary search tree, and descr...

Space Efficient Suffix Trees

We give the first representation of a suffix tree that uses n lg n + O(n) bits of space and supports searching for a pattern string in the given text (from a fixed size alphabet) in O(m) time, where n is the size of the text and m is the length of the pattern. The structure is quite simple and answers a question raised by Muthukrishnan in [22]. Previous compact representations of suffix trees had either...

A Generalized Suffix Tree and its (Un)expected Asymptotic Behaviors

Suffix trees find several applications in computer science and telecommunications, most notably in algorithms on strings, data compression and codes. Despite this, very little is known about their typical behaviors. In a probabilistic framework, we consider a family of suffix trees (further called b-suffix trees) built from the first n suffixes of a random word. In this family a noncompact suffix tree (i....

Publication date: 2007